Abstract

We address the excess entropy, which is a measure of complexity for 
stationary time series, from the ordinal point of view. We show that 
the permutation excess entropy is equal to the mutual information 
between two adjacent semi-infinite blocks in the space of orderings for 
finite-state stationary ergodic Markov processes. This result may shed
new light on the relationship between complexity and anticipation.
1 Introduction

Recently, it was found that much of the information contained in
stationary time series can be captured by the orderings between values
rather than by the values themselves [T]. The permutation entropy rate,
first introduced in [5][6], quantifies the average uncertainty of
orderings between values per time unit. This is in contrast to the usual
entropy rate, which quantifies the average uncertainty of values per time
unit. Surprisingly, however, the permutation entropy rate is known to be
equal to the entropy rate for finite-state stationary stochastic
processes [H[2]. Similar results for dynamical systems are also known
[21 EH 02].

In our previous work [14], we found a new proof of the equality between
the permutation entropy rate and the entropy rate based on a duality
between values and orderings, which can be seen as a Galois connection
|llj (a categorical adjunction [IT] for partially ordered sets; we do not
refer to the Galois connection explicitly in this paper). By making use
of the duality, we also proved that the permutation excess entropy is
equal to the excess entropy for finite-state stationary ergodic Markov
processes. The excess entropy has attracted interest from the complex
systems community for decades |1[ d El [TOl HZ1 H3 HS1 [H]. By
definition, the excess entropy is the sum of entropy over-estimates over
finite lengths of words [10]. However, it can also be expressed as the
mutual information between the past and the future, namely, the mutual
information between two adjacent semi-infinite blocks of stochastic
variables. Thus, the excess entropy can be interpreted as a measure of
global correlation present in a system.

In this paper, based on the duality between values and orderings, we
show that the permutation excess entropy also admits a mutual information
expression in the space of orderings when the process is a finite-state
stationary ergodic Markov process. This result partially justifies the
claim that the permutation excess entropy measures global correlation at
the level of orderings between values present in stationary time series.

This paper is organized as follows. In Section 2, we review the duality 
between values and orderings. In Section 3, we explain the permutation 
excess entropy. In Section 4, we present a proof of the claim that the per- 
mutation excess entropy has a mutual information expression for finite-state 
stationary ergodic Markov processes. In Section 5, we give conclusions. 

2 Duality between Values and Orderings Explained 

Let $A_n = \{1, 2, \ldots, n\}$ be a finite alphabet consisting of natural
numbers from 1 to $n$. We consider $A_n$ as a totally ordered set ordered
by the usual 'less-than-or-equal-to' relationship.

We denote the set of all permutations of length $L \ge 1$ by
$\mathcal{S}_L$. Namely, each element $\pi \in \mathcal{S}_L$ is a
bijection on the set $\{1, 2, \ldots, L\}$. For convenience, we denote
each permutation $\pi \in \mathcal{S}_L$ by a string
$\pi(1) \cdots \pi(L)$.

For each word $s_1^L := s_1 \cdots s_L := (s_1, \ldots, s_L) \in A_n^L =
\underbrace{A_n \times \cdots \times A_n}_{L}$ of length $L \ge 1$, we
define its permutation type $\pi \in \mathcal{S}_L$ by re-ordering the
symbols $s_1, \ldots, s_L$ in increasing order: $s_1^L$ is of type $\pi$
if we have $s_{\pi(i)} \le s_{\pi(i+1)}$ and $\pi(i) < \pi(i+1)$ when
$s_{\pi(i)} = s_{\pi(i+1)}$ for $i = 1, 2, \ldots, L-1$. For example,
$\pi(1)\pi(2)\pi(3)\pi(4) = 3142$ for $s_1^4 = 2312$ because
$s_3 s_1 s_4 s_2 = 1223$.

We introduce a map $\phi : A_n^L \to \mathcal{S}_L$ that sends each word
$s_1^L$ to its unique permutation type $\pi = \phi(s_1^L)$. This map
classifies or coarse-grains words of length $L$ by the criterion of
whether they have the same permutation type. In general, $\phi$ is a
many-to-one map. For example, all of $111, 112, 122, 222 \in A_2^3$ have
the same permutation type $\pi \in \mathcal{S}_3$ defined by
$\pi(1)\pi(2)\pi(3) = 123$ (identity on $\{1, 2, 3\}$).
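The map $\phi$ can be sketched in a few lines of Python. This is only an illustration, not code from the paper: the function name `permutation_type` and the representation of words and permutations as tuples are assumptions made here. The tie-breaking rule ($\pi(i) < \pi(i+1)$ for equal symbols) is obtained from the stability of Python's sort.

```python
from itertools import product

def permutation_type(word):
    """Return the permutation type of `word` as a tuple of 1-based indices.

    Positions are sorted by symbol value; Python's sort is stable, so equal
    symbols keep their original index order, matching the tie-breaking rule.
    """
    order = sorted(range(len(word)), key=lambda i: word[i])
    return tuple(i + 1 for i in order)

# All words of length 3 over A_2 = {1, 2} whose type is the identity 123.
identity_class = [w for w in product((1, 2), repeat=3)
                  if permutation_type(w) == (1, 2, 3)]
```

Calling `permutation_type((2, 3, 1, 2))` reproduces the type 3142 of the example above, and `identity_class` collects the four words 111, 112, 122, 222.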

Now, we list the properties of the map $\phi$ which will be used later.

Lemma 1 For $s_1^L, t_1^L \in A_n^L$, $\phi(s_1^L) = \phi(t_1^L)$ if and
only if $s_j \le s_k \Leftrightarrow t_j \le t_k$ for all
$1 \le j \le k \le L$.

Proof. See Corollary 4 in [14]. □



Lemma 2 Let $n \ge i \ge 1$. Fix $\pi \in \mathcal{S}_L$. Assume that
there is no $s_1^L \in A_{i-1}^L$ such that $\phi(s_1^L) = \pi$, but there
exists $s_1^L \in A_i^L$ such that $\phi(s_1^L) = \pi$ (when $i = 1$ we
define $A_{i-1} = A_0 = \emptyset$). Here $\phi$ denotes the map
$\phi : A_n^L \to \mathcal{S}_L$.

(i) There exists a unique $s_1^L \in A_i^L$ such that $\phi(s_1^L) = \pi$.
Moreover, if $\phi(t_1^L) = \pi$ for $t_1^L \in A_n^L$, then there exist
$c_1, \ldots, c_L$ such that $s_k + c_k = t_k$ for $k = 1, \ldots, L$ and
$0 \le c_{\pi(1)} \le \cdots \le c_{\pi(L)} \le n - i$.

(ii) $|\phi^{-1}(\pi)| = \binom{L+n-i}{n-i}$, where $|X|$ denotes the
cardinality of a set $X$.

Proof. See Lemma 5 in [14]. (ii) follows from the fact that the number of
sequences $a_1 \cdots a_L$ satisfying
$0 \le a_1 \le a_2 \le \cdots \le a_L \le n - i$ is given by the binomial
coefficient $\binom{L+n-i}{n-i}$. □

For example, let $\pi \in \mathcal{S}_5$ be given by
$\pi(1)\pi(2)\pi(3)\pi(4)\pi(5) = 24315$. We have $\phi(s_1^5) = \pi$ for
$s_1^5 = s_1 s_2 s_3 s_4 s_5 = 31213 \in A_3^5$. Consider
$t_1^5 = t_1 t_2 t_3 t_4 t_5 = 41325 \in A_5^5$ and
$c_1 c_2 c_3 c_4 c_5 = 10112$. We have $\phi(t_1^5) = \pi$ and
$t_2 t_4 t_3 t_1 t_5 = 12345 = 11233 + 01112 = s_2 s_4 s_3 s_1 s_5 + c_2 c_4 c_3 c_1 c_5$.
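Lemma 2 (ii) can be checked by brute force for small $n$ and $L$. The sketch below is an illustration under assumptions made here: the helper names are hypothetical, and $i$ is computed as the smallest alphabet size on which a given type is already realized (the smallest maximal symbol among the words in its inverse image).

```python
from collections import defaultdict
from itertools import product
from math import comb

def permutation_type(word):
    # Stable sort of positions by symbol value gives the permutation type.
    order = sorted(range(len(word)), key=lambda i: word[i])
    return tuple(i + 1 for i in order)

def check_lemma_2(n, L):
    """Check |phi^{-1}(pi)| == C(L+n-i, n-i) for every realized type pi."""
    fibers = defaultdict(list)
    for w in product(range(1, n + 1), repeat=L):
        fibers[permutation_type(w)].append(w)
    for words in fibers.values():
        # i = smallest alphabet size A_i on which this type is realized.
        i = min(max(w) for w in words)
        if len(words) != comb(L + n - i, n - i):
            return False
    return True
```

For instance, `check_lemma_2(2, 3)` covers the case $L = 3$, $n = 2$, and the two words 31213 and 41325 of the example above are sent to the same type 24315.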

As a more thorough illustration of Lemma 2, let us write down how $\phi$
sends each word to its permutation type for $L = 3$ and $n = 1, 2$.

When $n = 1$, the unique element $111 \in A_1^3$ is mapped to
$123 \in \mathcal{S}_3$. When $n = 2$, the map
$\phi : A_2^3 \to \mathcal{S}_3$ is given by

    111, 112, 122, 222  ->  123
    121                 ->  132
    212                 ->  213
    211                 ->  231
    221                 ->  312
    (no word)           ->  321



For example, there is no $s_1^3 \in A_1^3$ such that
$\phi(s_1^3) = 132 \in \mathcal{S}_3$. On the other hand,
$\phi^{-1}(132) = \{121\}$ for $\phi : A_2^3 \to \mathcal{S}_3$. We have
$\phi^{-1}(123) = \{111, 112, 122, 222\}$ for
$\phi : A_2^3 \to \mathcal{S}_3$. Note that
$|\phi^{-1}(123)| = 4 = \binom{4}{1}$.

Let us introduce a map $\mu : \mathcal{S}_L \to \mathbb{N}^L$, where
$\mathbb{N} = \{1, 2, \ldots\}$ is the set of all natural numbers, by the
following procedure:

(i) Given a permutation $\pi \in \mathcal{S}_L$, we decompose the sequence
$\pi(1) \cdots \pi(L)$ into maximal ascending subsequences. A subsequence
$i_j \cdots i_{j+k}$ of a sequence $i_1 \cdots i_L$ is called a maximal
ascending subsequence if it is ascending, namely,
$i_j \le i_{j+1} \le \cdots \le i_{j+k}$, and neither
$i_{j-1} i_j \cdots i_{j+k}$ nor $i_j i_{j+1} \cdots i_{j+k+1}$ is
ascending.

(ii) If $\pi(1) \cdots \pi(i_1)$, $\pi(i_1 + 1) \cdots \pi(i_2)$,
$\ldots$, $\pi(i_{k-1} + 1) \cdots \pi(L)$ is a decomposition of
$\pi(1) \cdots \pi(L)$ into maximal ascending subsequences, then we
define a word $s_1^L \in \mathbb{N}^L$ by

$$s_{\pi(1)} = \cdots = s_{\pi(i_1)} = 1, \quad s_{\pi(i_1+1)} = \cdots = s_{\pi(i_2)} = 2, \quad \ldots, \quad s_{\pi(i_{k-1}+1)} = \cdots = s_{\pi(L)} = k.$$

We define $\mu(\pi) = s_1^L$.

By construction, we have $\phi \circ \mu(\pi) = \pi$ when
$\mu(\pi) \in A_n^L$ for all $\pi \in \mathcal{S}_L$.
For example, a decomposition of $15423 \in \mathcal{S}_5$ into maximal
ascending subsequences is $15, 4, 23$. We obtain
$\mu(\pi) = s_1 s_2 s_3 s_4 s_5 = 13321$ by putting
$s_1 s_5 s_4 s_2 s_3 = 11233$.
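The two-step procedure defining $\mu$ can be sketched as a single pass over the permutation: a new maximal ascending subsequence starts at every descent. The function names and the tuple representation are assumptions of this sketch, not notation from the paper.

```python
from itertools import permutations

def permutation_type(word):
    # Stable sort of positions by symbol value gives the permutation type.
    order = sorted(range(len(word)), key=lambda i: word[i])
    return tuple(i + 1 for i in order)

def mu(pi):
    """Map a permutation (tuple of 1-based values) to the word mu(pi)."""
    word = [0] * len(pi)
    label = 1
    for pos, value in enumerate(pi):
        if pos > 0 and pi[pos - 1] > value:  # a descent starts a new run
            label += 1
        word[value - 1] = label  # s_{pi(pos)} gets the label of its run
    return tuple(word)

def round_trip_holds(L):
    """Check phi(mu(pi)) == pi for every permutation of length L."""
    return all(permutation_type(mu(pi)) == pi
               for pi in permutations(range(1, L + 1)))
```

With this sketch, `mu((1, 5, 4, 2, 3))` gives the word 13321 of the example, and `round_trip_holds(L)` tests the identity $\phi \circ \mu(\pi) = \pi$ exhaustively for a given small $L$.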

The map $\mu$ can be seen as a dual to the map $\phi$ in the following
sense:

Theorem 3 Let us put

$$X = \{ s_1^L \in A_n^L \mid \phi^{-1}(\pi) = \{ s_1^L \} \text{ for some } \pi \in \mathcal{S}_L \}, \tag{1}$$

$$Y = \{ \pi \in \mathcal{S}_L \mid |\phi^{-1}(\pi)| = 1 \}. \tag{2}$$

Then, $\phi$ restricted on $X$ is a map into $Y$, $\mu$ restricted on $Y$
is a map into $X$, and they form a pair of mutually inverse maps.
Furthermore, we have

$$X = \{ s_1^L \in A_n^L \mid \forall\, 1 \le i \le n-1 \ \exists\, 1 \le j < k \le L \text{ s.t. } s_j = i+1,\ s_k = i \}. \tag{3}$$



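Proof. See Theorem 9 in [14]. □

For the map $\phi : A_2^3 \to \mathcal{S}_3$, the duality

$$X \underset{\mu}{\overset{\phi}{\rightleftarrows}} Y \tag{4}$$

is given by

    121  <->  132
    211  <->  231
    212  <->  213
    221  <->  312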
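As a brute-force check of Theorem 3, the following sketch enumerates the sets $X$ and $Y$ for small $n$ and $L$ and compares $X$ with the characterization (3). All function names are hypothetical and introduced only for this illustration.

```python
from collections import defaultdict
from itertools import product

def permutation_type(word):
    order = sorted(range(len(word)), key=lambda i: word[i])  # stable sort
    return tuple(i + 1 for i in order)

def mu(pi):
    word, label = [0] * len(pi), 1
    for pos, value in enumerate(pi):
        if pos > 0 and pi[pos - 1] > value:  # descent: a new ascending run
            label += 1
        word[value - 1] = label
    return tuple(word)

def duality_sets(n, L):
    """Return (X, Y) of Theorem 3 for phi : A_n^L -> S_L by enumeration."""
    fibers = defaultdict(list)
    for w in product(range(1, n + 1), repeat=L):
        fibers[permutation_type(w)].append(w)
    Y = {pi for pi, words in fibers.items() if len(words) == 1}
    X = {fibers[pi][0] for pi in Y}
    return X, Y

def satisfies_eq3(word, n):
    """Condition (3): for each i < n, some i+1 occurs before some i."""
    L = len(word)
    return all(
        any(word[j] == i + 1 and word[k] == i
            for j in range(L) for k in range(j + 1, L))
        for i in range(1, n)
    )
```

For $n = 2$ and $L = 3$, `duality_sets(2, 3)` returns the four words 121, 211, 212, 221 and the four permutations 132, 231, 213, 312, and applying `mu` to the permutations recovers the words.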




3 Permutation Excess Entropy 

Let $S = \{S_1, S_2, \ldots\}$ be a finite-state stationary stochastic
process, where each stochastic variable $S_i$ takes its value in $A_n$.
By stationarity, we mean

$$\Pr\{S_1 = s_1, \ldots, S_L = s_L\} = \Pr\{S_{k+1} = s_1, \ldots, S_{k+L} = s_L\}$$

for any $k, L \ge 1$ and $s_1, \ldots, s_L \in A_n$. Hence, we can define
the probability of occurrence of each word $s_1^L \in A_n^L$ by
$p(s_1^L) := p(s_1 \cdots s_L) := \Pr\{S_1 = s_1, \ldots, S_L = s_L\}$.

The entropy rate $h(S)$ of a finite-state stationary stochastic process
$S = \{S_1, S_2, \ldots\}$, which quantifies the average uncertainty of
values per time unit, is defined by

$$h(S) = \lim_{L \to \infty} \frac{1}{L} H(S_1^L), \tag{5}$$

where $H(S_1^L) = H(S_1, \ldots, S_L) = -\sum_{s_1^L \in A_n^L} p(s_1^L) \log_2 p(s_1^L)$.
The limit exists for any finite-state stationary stochastic process [8].

The permutation entropy rate quantifies the average uncertainty of
orderings between values per time unit. It is defined by

$$h^*(S) = \lim_{L \to \infty} \frac{1}{L} H^*(S_1^L) \tag{6}$$

if the limit exists, where
$H^*(S_1^L) = H^*(S_1, \ldots, S_L) = -\sum_{\pi \in \mathcal{S}_L} p(\pi) \log_2 p(\pi)$
and $p(\pi)$ is the probability that $\pi$ is realized in $S$, namely,
$p(\pi) = \sum_{s_1^L \in \phi^{-1}(\pi)} p(s_1^L)$ for
$\pi \in \mathcal{S}_L$.
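To make the block quantities $H(S_1^L)$ and $H^*(S_1^L)$ concrete, the sketch below evaluates both by exhaustive enumeration. It assumes, purely for illustration, an i.i.d. process with given symbol probabilities; the function name `block_entropies` and this choice of process are not from the paper.

```python
from collections import defaultdict
from itertools import product
from math import log2

def permutation_type(word):
    order = sorted(range(len(word)), key=lambda i: word[i])  # stable sort
    return tuple(i + 1 for i in order)

def block_entropies(symbol_probs, L):
    """Return (H, H_star) in bits for an i.i.d. process over A_n.

    symbol_probs[s - 1] is the probability of symbol s (assumed example).
    """
    n = len(symbol_probs)
    type_prob = defaultdict(float)  # p(pi), summed over phi^{-1}(pi)
    H = 0.0
    for w in product(range(1, n + 1), repeat=L):
        p = 1.0
        for s in w:
            p *= symbol_probs[s - 1]
        if p > 0:
            H -= p * log2(p)
            type_prob[permutation_type(w)] += p
    H_star = -sum(q * log2(q) for q in type_prob.values() if q > 0)
    return H, H_star
```

Because $\phi$ coarse-grains words, the returned values always satisfy $H^*(S_1^L) \le H(S_1^L)$, and $H^*(S_1^1) = 0$ since there is only one permutation of length 1.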

Theorem 4 For any finite-state stationary stochastic process $S$, the
permutation entropy rate $h^*(S)$ exists and

$$h^*(S) = h(S). \tag{7}$$

Proof. The proof appealing to ergodic theory is found in [H[2]. For an
alternative proof based on the duality between values and orderings, see
[T3]. □

The entropy rate can be seen as a measure of randomness of a finite-state
stationary stochastic process. Meanwhile, the excess entropy can be
interpreted as a measure of complexity [12]. More precisely, it measures
global correlation present in a system. The excess entropy $E(S)$ of a
finite-state stationary stochastic process $S$ is defined by [10]

$$E(S) = \lim_{L \to \infty} \left( H(S_1^L) - h(S) L \right) \tag{8}$$

if the limit exists. If $E(S)$ exists, then we have [10]

$$E(S) = \sum_{L=1}^{\infty} \left( H(S_L \mid S_1^{L-1}) - h(S) \right) = \lim_{L \to \infty} I(S_1^L; S_{L+1}^{2L}), \tag{9}$$

where $H(Y \mid X)$ is the conditional entropy of $Y$ given $X$ and
$I(X; Y)$ is the mutual information between $X$ and $Y$ for stochastic
variables $X$ and $Y$.

The permutation excess entropy was introduced in p3] by imitating the
definition of the excess entropy. The permutation excess entropy $E^*(S)$
of a finite-state stationary stochastic process $S$ is defined by

$$E^*(S) = \lim_{L \to \infty} \left( H^*(S_1^L) - h^*(S) L \right), \tag{10}$$

if the limit exists. However, it is unclear from this expression what form
of correlation the permutation excess entropy quantifies. In the following
discussion, we partially resolve this problem. We will show that the
equality

$$E^*(S) = \lim_{L \to \infty} I(\phi(S_1^L); \phi(S_{L+1}^{2L})) \tag{11}$$
holds for any finite-state stationary ergodic Markov process $S$. Recall
that the entropy rate and the excess entropy of a finite-state stationary
Markov process $S$ are given by
$h(S) = -\sum_{i,j=1}^{n} p_i p_{ij} \log_2 p_{ij}$ and
$E(S) = -\sum_{i=1}^{n} p_i \log_2 p_i + \sum_{i,j=1}^{n} p_i p_{ij} \log_2 p_{ij}$,
respectively, where $P = (p_{ij})$ is a transition matrix and
$p = (p_1, \ldots, p_n)$ is a stationary distribution. $P$ and $p$ satisfy
$p_{ij} \ge 0$ for all $1 \le i, j \le n$, $\sum_{j=1}^{n} p_{ij} = 1$ for
all $1 \le i \le n$, $p_i \ge 0$ for all $1 \le i \le n$,
$\sum_{i=1}^{n} p_i = 1$ and $\sum_{i=1}^{n} p_i p_{ij} = p_j$ for all
$1 \le j \le n$. The probability of occurrence of each word
$s_1^L \in A_n^L$ is given by
$p(s_1^L) = p_{s_1} p_{s_1 s_2} \cdots p_{s_{L-1} s_L}$. A finite-state
stationary Markov process $S$ is ergodic if and only if its transition
matrix $P$ is irreducible [20]: a matrix $P$ is irreducible if for all
$1 \le i, j \le n$ there exists $l > 0$ such that $p_{ij}^{(l)} > 0$,
where $p_{ij}^{(l)}$ is the $(i, j)$-th element of $P^l$. For an
irreducible non-negative matrix, the stationary distribution
$p = (p_1, \ldots, p_n)$ exists uniquely and satisfies $p_i > 0$ for all
$1 \le i \le n$.
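The closed forms for $h(S)$ and $E(S)$ of a Markov process are easy to evaluate numerically. The sketch below is an illustration only: the function names are hypothetical, states are indexed from 0, and the stationary distribution is approximated by iterating the lazy chain $(P + I)/2$, a choice made here because it has the same stationary distribution as $P$ and also converges when $P$ is periodic.

```python
from math import log2

def stationary_distribution(P, iterations=10_000):
    """Approximate the stationary distribution of an irreducible matrix P."""
    n = len(P)
    p = [1.0 / n] * n
    for _ in range(iterations):
        # One step of the lazy chain (P + I) / 2.
        p = [0.5 * p[j] + 0.5 * sum(p[i] * P[i][j] for i in range(n))
             for j in range(n)]
    return p

def entropy_rate(P, p):
    """h(S) = -sum_{i,j} p_i p_ij log2 p_ij (terms with p_ij = 0 dropped)."""
    n = len(P)
    return -sum(p[i] * P[i][j] * log2(P[i][j])
                for i in range(n) for j in range(n) if P[i][j] > 0)

def excess_entropy(P, p):
    """E(S) = -sum_i p_i log2 p_i - h(S)."""
    return -sum(x * log2(x) for x in p if x > 0) - entropy_rate(P, p)
```

Two extreme cases show the contrast between randomness and correlation: the fair-coin chain has $h = 1$ and $E = 0$, whereas the deterministic period-2 chain has $h = 0$ and $E = 1$.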

In our previous work [14], we showed that the equality

$$E^*(S) = E(S) \tag{12}$$

holds for any finite-state stationary ergodic Markov process. The key
point of the proof is that the probability

$$q_L = \sum_{\pi \in \mathcal{S}_L,\ |\phi^{-1}(\pi)| > 1} p(\pi) = \sum_{\pi \in \mathcal{S}_L,\ \pi \notin Y} p(\pi) \tag{13}$$

diminishes exponentially fast as $L \to \infty$ for any finite-state
stationary ergodic Markov process, where the set $Y$ is given by (2) in
Theorem 3. For the proof of the equality (11), we also appeal to this
fact. Hence, we briefly review why this fact holds.
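The probability $q_L$ in (13) can be computed exactly for small $L$ by enumerating all words. The sketch below assumes a Markov process given by a transition matrix `P` and a stationary distribution `p` with states indexed from 0; the function names are introduced here for illustration only.

```python
from collections import defaultdict
from itertools import product

def permutation_type(word):
    order = sorted(range(len(word)), key=lambda i: word[i])  # stable sort
    return tuple(i + 1 for i in order)

def q_L(P, p, L):
    """Total probability of permutation types pi with |phi^{-1}(pi)| > 1."""
    n = len(P)
    fiber_size = defaultdict(int)
    type_prob = defaultdict(float)
    for w in product(range(n), repeat=L):
        pi = permutation_type(w)
        fiber_size[pi] += 1
        prob = p[w[0]]
        for a, b in zip(w, w[1:]):
            prob *= P[a][b]  # Markov word probability
        type_prob[pi] += prob
    return sum(q for pi, q in type_prob.items() if fiber_size[pi] > 1)
```

For the two-state example `P = [[0.9, 0.1], [0.4, 0.6]]` with `p = [0.8, 0.2]`, the only word of length 2 with a unique type is the descent (state 1 followed by state 0), so $q_2 = 1 - 0.2 \cdot 0.4 = 0.92$. The values of `q_L` cannot increase with $L$, because a word that extends a member of $X$ still satisfies condition (3).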

Let $L$ be a positive integer. We introduce the following probability
$\beta_s$ for each symbol $s \in A_n$:

$$\beta_s = \Pr\{ s_1^N \mid s_j \ne s \text{ for any } 1 \le j \le N \}, \tag{14}$$

where $N = \lfloor L/2 \rfloor$ and $\lfloor x \rfloor$ is the largest
integer not greater than $x$.

Lemma 5 (Lemma 12 in [14]) Let $S$ be a finite-state stationary
stochastic process and $\epsilon$ be a positive real number. If
$\beta_s < \epsilon$ for any $s \in A_n$, then $q_L \le 2n\epsilon$.

Proof. We shall prove $\sum_{\pi \in Y} p(\pi) \ge 1 - 2n\epsilon$, where
the set $Y$ is given by (2) in Theorem 3. Let us consider a word
$s_1^L \in A_n^L$ satisfying the following two conditions:

(i) Each symbol $s \in A_n$ appears in $s_1^N$ at least once.

(ii) Each symbol $s \in A_n$ appears in $s_{N+1}^L$ at least once.

By the assumption of the lemma, we have

$$\Pr\{ s_1^N \mid \text{(i) holds} \} > 1 - n\epsilon,$$

because

$$\Pr\{ s_1^N \mid \text{(i) holds} \} + \sum_{s=1}^{n} \Pr\{ s_1^N \mid s_j \ne s \text{ for any } 1 \le j \le N \} \ge 1.$$

Similarly,

$$\Pr\{ s_{N+1}^L \mid \text{(ii) holds} \} > 1 - n\epsilon$$

holds because of the stationarity. Hence, we obtain

$$\Pr\{ s_1^L \mid \text{both (i) and (ii) hold} \} > 1 - 2n\epsilon.$$

Since a word $s_1^L \in A_n^L$ satisfying both (i) and (ii) is a member
of the set $X$ characterized by (3) in Theorem 3, we obtain

$$\sum_{\pi \in Y} p(\pi) = \sum_{s_1^L \in X} p(s_1^L) \ge \Pr\{ s_1^L \mid \text{both (i) and (ii) hold} \} > 1 - 2n\epsilon.$$

□

Let $S$ be a finite-state stationary ergodic Markov process whose
transition matrix is $P$ and stationary distribution is $p$. We can write
$\beta_s$ in the following form by using the Markov property:

$$\beta_s = \sum_{s_j \ne s,\ 1 \le j \le N} p(s_1 \cdots s_N) = \sum_{s_j \ne s,\ 1 \le j \le N} p_{s_1} p_{s_1 s_2} \cdots p_{s_{N-1} s_N} = \langle (P_s)^{N-1} u_s, p \rangle, \tag{15}$$

where a matrix $P_s$ is defined by

$$(P_s)_{ij} = \begin{cases} 0 & \text{if } i = s \\ p_{ij} & \text{otherwise,} \end{cases}$$

a vector $u_s = (u_1, \ldots, u_n)$ is defined by $u_i = 0$ if $i = s$,
otherwise $u_i = 1$, and $\langle \cdot, \cdot \rangle$ is the usual inner
product in the $n$-dimensional Euclidean space.
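The matrix expression (15) can be compared against a direct enumeration of the words that avoid the symbol $s$. The following sketch is an illustration under assumptions made here: states are indexed from 0, matrices are plain nested lists, and both function names are hypothetical.

```python
from itertools import product

def beta_bruteforce(P, p, s, N):
    """Sum p(s_1 ... s_N) over all words of length N that avoid state s."""
    n = len(P)
    allowed = [i for i in range(n) if i != s]
    total = 0.0
    for w in product(allowed, repeat=N):
        prob = p[w[0]]
        for a, b in zip(w, w[1:]):
            prob *= P[a][b]
        total += prob
    return total

def beta_matrix(P, p, s, N):
    """Evaluate <(P_s)^(N-1) u_s, p> as in (15)."""
    n = len(P)
    # P_s: row s of P replaced by zeros.
    P_s = [[0.0 if i == s else P[i][j] for j in range(n)] for i in range(n)]
    v = [0.0 if i == s else 1.0 for i in range(n)]  # the vector u_s
    for _ in range(N - 1):
        v = [sum(P_s[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * p[i] for i in range(n))
```

The matrix form needs on the order of $N n^2$ operations instead of enumerating $(n-1)^N$ words, which is what makes the eigenvalue argument for the decay of $\beta_s$ possible.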






We can prove that the non-negative largest eigenvalue $\lambda$ of $P_s$
is strictly less than 1 and that the absolute value of any other
eigenvalue of $P_s$ is not greater than $\lambda$ by using the
Perron-Frobenius theorem for non-negative matrices and the irreducibility
of $P$ (Lemma 13 in [14]). Hence, by decomposing $P_s$ into a sum of a
diagonalizable matrix and a nilpotent matrix, we obtain the following
lemma:

Lemma 6 Let $S$ be a finite-state stationary ergodic Markov process. There
exist $0 < \alpha < 1$, $C > 0$ and a positive integer $k$ such that
$\beta_s < C \alpha^L L^k$ for any $s \in A_n$ and sufficiently large $L$.

4 Mutual Information Expression of Permutation 
Excess Entropy 

In this section, we give a proof of the equality (11) for finite-state
stationary ergodic Markov processes. We make use of the notions of rank
sequences and rank variables which are introduced in [2].

Rank sequences of length $L$ are words $r_1^L \in \mathbb{N}^L$
satisfying $1 \le r_i \le i$ for $i = 1, \ldots, L$. We denote the set of
all rank sequences of length $L$ by $\mathcal{R}_L$. Clearly,
$|\mathcal{R}_L| = L! = |\mathcal{S}_L|$.

We can transform each word $s_1^L \in A_n^L$ into a rank sequence
$r_1^L \in \mathcal{R}_L$ by defining

$$r_i = \sum_{j=1}^{i} \delta(s_j \le s_i), \quad i = 1, \ldots, L, \tag{16}$$

where $\delta(X) = 1$ if the proposition $X$ is true, otherwise
$\delta(X) = 0$. Namely, $r_i$ is the number of indices $j$
($1 \le j \le i$) such that $s_j \le s_i$. Thus, we obtain a map
$\varphi : A_n^L \to \mathcal{R}_L$ such that $\varphi(s_1^L) = r_1^L$.

We can show that the map $\varphi : A_n^L \to \mathcal{R}_L$ is
compatible with the map $\phi : A_n^L \to \mathcal{S}_L$. Namely, there
exists a bijection $\iota : \mathcal{R}_L \to \mathcal{S}_L$ satisfying
$\iota \circ \varphi = \phi$ [Tg].
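The rank map (16) and its compatibility with $\phi$ can be illustrated as follows. The function names are assumptions of this sketch, and the compatibility test only checks, by enumeration, that two words share a rank sequence exactly when they share a permutation type.

```python
from itertools import product

def permutation_type(word):
    order = sorted(range(len(word)), key=lambda i: word[i])  # stable sort
    return tuple(i + 1 for i in order)

def rank_sequence(word):
    """r_i = number of indices j <= i with s_j <= s_i, as in (16)."""
    return tuple(
        sum(1 for j in range(i + 1) if word[j] <= word[i])
        for i in range(len(word))
    )

def ranks_match_types(n, L):
    """Check that rank sequences and permutation types induce the same
    partition of the words of length L over A_n."""
    type_to_rank, rank_to_type = {}, {}
    for w in product(range(1, n + 1), repeat=L):
        pi, r = permutation_type(w), rank_sequence(w)
        if type_to_rank.setdefault(pi, r) != r:
            return False
        if rank_to_type.setdefault(r, pi) != pi:
            return False
    return True
```

For the word 2312 used earlier, `rank_sequence((2, 3, 1, 2))` yields the rank sequence 1213, and each entry $r_i$ indeed lies between 1 and $i$.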

Given a stationary stochastic process $S = \{S_1, S_2, \ldots\}$, its
associated rank variables are defined by
$R_i = \sum_{j=1}^{i} \delta(S_j \le S_i)$ for $i = 1, 2, \ldots$. Note
that the rank variables $R_i$ ($i = 1, 2, \ldots$) are not stationary
stochastic variables in general. By the compatibility between $\phi$ and
$\varphi$, we have

$$H(R_1^L) = H^*(S_1^L) = H(\phi(S_1^L)) \tag{17}$$

for $L \ge 1$.
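Now, let $S$ be a finite-state stationary ergodic Markov process. By (12),
we know that the permutation excess entropy $E^*(S)$ exists. By (17) and
the chain rule, we have

$$E^*(S) = \lim_{L \to \infty} \left( H^*(S_1^L) - h^*(S) L \right) = \lim_{L \to \infty} \left( H(R_1^L) - h^*(S) L \right) = \sum_{L=1}^{\infty} \left( H(R_L \mid R_1^{L-1}) - h^*(S) \right). \tag{18}$$

Since the infinite sum in (18) converges, we obtain

$$\left| H(R_{L+1}^{2L} \mid R_1^L) - h^*(S) L \right| = \left| \sum_{i=1}^{L} \left( H(R_{L+i} \mid R_1^{L+i-1}) - h^*(S) \right) \right| \underset{L \to \infty}{\longrightarrow} 0. \tag{19}$$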



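By the definition of mutual information, we have
$I(\phi(S_1^L); \phi(S_{L+1}^{2L})) = H(\phi(S_{L+1}^{2L})) - H(\phi(S_{L+1}^{2L}) \mid \phi(S_1^L))$.
By the stationarity of $S$,
$H(\phi(S_{L+1}^{2L})) = H(\phi(S_1^L)) = H^*(S_1^L)$. Hence, it is
sufficient to show that

$$\left| H(\phi(S_{L+1}^{2L}) \mid \phi(S_1^L)) - h^*(S) L \right| \underset{L \to \infty}{\longrightarrow} 0 \tag{20}$$

to prove the equality (11). However, by (19), this reduces to showing that

$$\left| H(\phi(S_{L+1}^{2L}) \mid \phi(S_1^L)) - H(R_{L+1}^{2L} \mid R_1^L) \right| \underset{L \to \infty}{\longrightarrow} 0, \tag{21}$$

which is equivalent to showing that

$$\left| H(\phi(S_1^L), \phi(S_{L+1}^{2L})) - H(\phi(S_1^{2L})) \right| \underset{L \to \infty}{\longrightarrow} 0 \tag{22}$$

by (17).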



Lemma 7 For $s_1^{2L}, t_1^{2L} \in A_n^{2L}$, if
$\phi(s_1^{2L}) = \phi(t_1^{2L})$, then $\phi(s_1^L) = \phi(t_1^L)$ and
$\phi(s_{L+1}^{2L}) = \phi(t_{L+1}^{2L})$. Namely, the partition of
$A_n^{2L}$ by the map $\phi : A_n^{2L} \to \mathcal{S}_{2L}$ is a
refinement of the partition of $A_n^L \times A_n^L = A_n^{2L}$ by the map
$\phi \times \phi : A_n^L \times A_n^L \to \mathcal{S}_L \times \mathcal{S}_L$.

Proof. The claim follows immediately from Lemma 1.

□
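The refinement property of Lemma 7 is easy to confirm by enumeration for small alphabets and block lengths. The helper names below are hypothetical and serve only as an illustration.

```python
from itertools import product

def permutation_type(word):
    order = sorted(range(len(word)), key=lambda i: word[i])  # stable sort
    return tuple(i + 1 for i in order)

def check_refinement(n, L):
    """Check that the type of a word of length 2L determines the types of
    its first and second halves."""
    halves_of = {}
    for w in product(range(1, n + 1), repeat=2 * L):
        whole = permutation_type(w)
        halves = (permutation_type(w[:L]), permutation_type(w[L:]))
        # setdefault stores the first pair seen for this type of length 2L.
        if halves_of.setdefault(whole, halves) != halves:
            return False
    return True
```

A call such as `check_refinement(3, 3)` runs over all 729 words of length 6 over $A_3$.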
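Lemma 8

$$0 \le H(\phi(S_1^{2L})) - H(\phi(S_1^L), \phi(S_{L+1}^{2L})) \le \left( \sum_{\substack{\pi', \pi'' \in \mathcal{S}_L, \\ |\phi^{-1}(\pi')| > 1 \text{ or } |\phi^{-1}(\pi'')| > 1}} p(\pi', \pi'') \right) 2n \log_2 (L + n) \tag{23}$$

holds for any finite-state stationary stochastic process $S$, where

$$p(\pi', \pi'') = \sum_{\substack{s_1^L \in \phi^{-1}(\pi'), \\ s_{L+1}^{2L} \in \phi^{-1}(\pi'')}} p(s_1^{2L})$$

for $\pi', \pi'' \in \mathcal{S}_L$.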

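Proof. By Lemma 7, we can write

$$H(\phi(S_1^{2L})) - H(\phi(S_1^L), \phi(S_{L+1}^{2L})) = -\sum_{\pi \in \mathcal{S}_{2L}} p(\pi) \log_2 p(\pi) + \sum_{\pi', \pi'' \in \mathcal{S}_L} p(\pi', \pi'') \log_2 p(\pi', \pi'')$$

$$= \sum_{\pi', \pi'' \in \mathcal{S}_L} \left( -\sum_{\phi^{-1}(\pi) \subseteq (\phi \times \phi)^{-1}(\pi', \pi'')} p(\pi) \log_2 p(\pi) + p(\pi', \pi'') \log_2 p(\pi', \pi'') \right)$$

$$= \sum_{\substack{\pi', \pi'' \in \mathcal{S}_L, \\ p(\pi', \pi'') > 0}} p(\pi', \pi'') \left( -\sum_{\phi^{-1}(\pi) \subseteq (\phi \times \phi)^{-1}(\pi', \pi'')} \frac{p(\pi)}{p(\pi', \pi'')} \log_2 \frac{p(\pi)}{p(\pi', \pi'')} \right),$$

where the inner sums run over $\pi \in \mathcal{S}_{2L}$ with non-empty
$\phi^{-1}(\pi)$. By Lemma 2 (ii), we have

$$0 \le -\sum_{\phi^{-1}(\pi) \subseteq (\phi \times \phi)^{-1}(\pi', \pi'')} \frac{p(\pi)}{p(\pi', \pi'')} \log_2 \frac{p(\pi)}{p(\pi', \pi'')} \le \log_2 \left( |\phi^{-1}(\pi')| \, |\phi^{-1}(\pi'')| \right) \le 2n \log_2 (L + n),$$

because the inner sum is the entropy of a probability distribution on at
most $|(\phi \times \phi)^{-1}(\pi', \pi'')| = |\phi^{-1}(\pi')| \, |\phi^{-1}(\pi'')|$
points and $|\phi^{-1}(\pi)| \le \binom{L+n-1}{n-1} \le (L+n)^n$ for any
$\pi \in \mathcal{S}_L$.

If $|\phi^{-1}(\pi')| = 1$ and $|\phi^{-1}(\pi'')| = 1$ hold for
$(\pi', \pi'') \in \mathcal{S}_L \times \mathcal{S}_L$, then
$|(\phi \times \phi)^{-1}(\pi', \pi'')| = 1$. In this case, if
$p(\pi', \pi'') > 0$, then we have

$$-\sum_{\phi^{-1}(\pi) \subseteq (\phi \times \phi)^{-1}(\pi', \pi'')} \frac{p(\pi)}{p(\pi', \pi'')} \log_2 \frac{p(\pi)}{p(\pi', \pi'')} = 0.$$

Hence only the pairs with $|\phi^{-1}(\pi')| > 1$ or
$|\phi^{-1}(\pi'')| > 1$ contribute, which gives (23).

□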



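Lemma 9 (22) holds for any finite-state stationary ergodic Markov process
$S$.

Proof. We have

$$\sum_{|\phi^{-1}(\pi')| > 1 \text{ or } |\phi^{-1}(\pi'')| > 1} p(\pi', \pi'') \le \sum_{\substack{|\phi^{-1}(\pi')| > 1, \\ \pi'' \in \mathcal{S}_L}} p(\pi', \pi'') + \sum_{\substack{|\phi^{-1}(\pi'')| > 1, \\ \pi' \in \mathcal{S}_L}} p(\pi', \pi'') = 2 \sum_{|\phi^{-1}(\pi')| > 1} p(\pi') = 2 q_L.$$

By Lemma 5 and Lemma 6, there exist $0 < \alpha < 1$, $C > 0$ and $k > 0$
such that $q_L < C \alpha^L L^k$ for sufficiently large $L$ if $S$ is a
finite-state stationary ergodic Markov process. The claim follows from
Lemma 8.

□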

Thus, we obtain the main theorem of this paper:

Theorem 10 The equality (11)

$$E^*(S) = \lim_{L \to \infty} I(\phi(S_1^L); \phi(S_{L+1}^{2L}))$$

holds for any finite-state stationary ergodic Markov process $S$.
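For small block lengths, the mutual information on the right-hand side of (11) can be evaluated exactly and compared with $E(S)$. The sketch below is only a numerical illustration, not part of the proof: the function names, the 0-based state labels and the example chain are assumptions made here, and the enumeration over $n^{2L}$ words limits it to small $L$.

```python
from collections import defaultdict
from itertools import product
from math import log2

def permutation_type(word):
    order = sorted(range(len(word)), key=lambda i: word[i])  # stable sort
    return tuple(i + 1 for i in order)

def ordinal_mutual_information(P, p, L):
    """I(phi(S_1^L); phi(S_{L+1}^{2L})) in bits for a stationary Markov chain."""
    n = len(P)
    joint = defaultdict(float)  # p(pi', pi'')
    for w in product(range(n), repeat=2 * L):
        prob = p[w[0]]
        for a, b in zip(w, w[1:]):
            prob *= P[a][b]
        joint[(permutation_type(w[:L]), permutation_type(w[L:]))] += prob
    past, future = defaultdict(float), defaultdict(float)
    for (a, b), prob in joint.items():
        past[a] += prob
        future[b] += prob
    return sum(prob * log2(prob / (past[a] * future[b]))
               for (a, b), prob in joint.items() if prob > 0)

def markov_excess_entropy(P, p):
    """E(S) = -sum_i p_i log2 p_i + sum_{i,j} p_i p_ij log2 p_ij."""
    n = len(P)
    H1 = -sum(x * log2(x) for x in p if x > 0)
    h = -sum(p[i] * P[i][j] * log2(P[i][j])
             for i in range(n) for j in range(n) if P[i][j] > 0)
    return H1 - h
```

By the data processing inequality, each finite-$L$ value is bounded above by $I(S_1^L; S_{L+1}^{2L})$, which equals $E(S)$ for a stationary Markov process, so the ordinal mutual information approaches $E^*(S) = E(S)$ from below.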



5 Conclusions 

In this paper, we showed that the permutation excess entropy is equal to
the mutual information between the past and the future in the space of
orderings for finite-state stationary ergodic Markov processes. We hope
that our result offers new insight into the relationship between
complexity and anticipation.